multi-label ranking
- Asia > China > Jiangsu Province > Nanjing (0.04)
- Oceania > Australia > Australian Capital Territory > Canberra (0.04)
Predicting Label Distribution from Multi-label Ranking
Label distribution can provide richer information about label polysemy than logical labels in multi-label learning. Two strategies currently exist for predicting label distributions: LDL (label distribution learning) and LE (label enhancement). LDL requires experts to annotate instances with label distributions and then learns a predictive mapping from such a training set. LE requires experts to annotate instances with logical labels and generates label distributions from them. However, LDL requires costly annotation, and the performance of LE is unstable. In this paper, we study the problem of predicting label distribution from multi-label ranking which is a compromise w.r.t.
Rethinking and Reweighting the Univariate Losses for Multi-Label Ranking: Consistency and Generalization
The (partial) ranking loss is a commonly used evaluation measure for multi-label classification, which is usually optimized with convex surrogates for computational efficiency. Prior theoretical efforts on multi-label ranking mainly focus on (Fisher) consistency analyses. However, there is a gap between existing theory and practice: some inconsistent pairwise losses can lead to promising performance, while some consistent univariate losses usually have no clear superiority in practice. To take a step towards filling this gap, this paper presents a systematic study from two complementary perspectives: consistency and generalization error bounds of learning algorithms. We theoretically find two key factors of the distribution (or dataset) that affect the learning guarantees of algorithms: the instance-wise class imbalance and the label size $c$. Specifically, in an extremely imbalanced case, the algorithm with the consistent univariate loss has an error bound of $O(c)$, while the bound for the one with the inconsistent pairwise loss depends on $O(\sqrt{c})$, as shown in prior work. This may shed light on the superior performance of pairwise methods in practice, where real datasets are usually highly imbalanced. Moreover, we present an inconsistent reweighted univariate loss-based algorithm that enjoys an error bound of $O(\sqrt{c})$, combining promising performance with the computational efficiency of univariate losses. Finally, experimental results confirm our theoretical findings.
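As a rough illustration of the quantities this abstract contrasts, the sketch below computes the ranking loss for one instance together with a pairwise and a univariate logistic surrogate. The function names and the choice of the logistic surrogate are assumptions made for illustration; the paper's own surrogates and its reweighted loss are not reproduced here.

```python
import math

def _split(scores, relevant):
    # Separate the scores of relevant and irrelevant labels.
    pos = [s for s, r in zip(scores, relevant) if r]
    neg = [s for s, r in zip(scores, relevant) if not r]
    return pos, neg

def ranking_loss(scores, relevant):
    """Fraction of (relevant, irrelevant) pairs ordered wrongly; ties count 0.5."""
    pos, neg = _split(scores, relevant)
    if not pos or not neg:
        return 0.0
    bad = sum(1.0 if p < n else 0.5 if p == n else 0.0
              for p in pos for n in neg)
    return bad / (len(pos) * len(neg))

def pairwise_logistic(scores, relevant):
    """Pairwise surrogate: one logistic term per (relevant, irrelevant) pair."""
    pos, neg = _split(scores, relevant)
    if not pos or not neg:
        return 0.0
    total = sum(math.log1p(math.exp(-(p - n))) for p in pos for n in neg)
    return total / (len(pos) * len(neg))

def univariate_logistic(scores, relevant):
    """Univariate surrogate: one logistic term per label, no pairs formed."""
    total = sum(math.log1p(math.exp(-s if r else s))
                for s, r in zip(scores, relevant))
    return total / len(scores)

scores = [0.1, 0.9, 0.9, 0.2]
relevant = [True, False, True, False]
print(ranking_loss(scores, relevant))  # 2.5 bad pairs out of 4
```

The univariate surrogate visits each of the $c$ labels once, whereas the pairwise surrogate visits every relevant-irrelevant pair, which is the computational trade-off the abstract refers to.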
UniMLR: Modeling Implicit Class Significance for Multi-Label Ranking
Yesilkaynak, V. Bugra, Dari, Emine, Mertan, Alican, Unal, Gozde
Existing multi-label ranking (MLR) frameworks only exploit information deduced from the bipartition of labels into positive and negative sets. They therefore do not benefit from the ranking among positive labels, which the MLR approach introduced in this paper exploits. We propose UniMLR, a new MLR paradigm that models implicit class relevance/significance values as probability distributions using the ranking among positive labels, rather than treating them as equally important. This approach unifies the ranking and classification tasks associated with MLR. Additionally, we address the challenges of scarcity and annotation bias in MLR datasets by introducing eight synthetic datasets (Ranked MNISTs) generated with varying significance-determining factors, providing an enriched and controllable experimental environment. We statistically demonstrate that our method accurately learns a representation of the positive rank order, which is consistent with the ground truth and proportional to the underlying significance values. Finally, we conduct comprehensive empirical experiments on both real-world and synthetic datasets, demonstrating the value of our proposed framework. Code is available at https://github.com/MrGranddy/UniMLR.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
- Asia > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Predicting Label Distribution from Multi-label Ranking
It is obvious that Eq. (5) holds for k = 2. The information of the datasets we used is shown in Table 1. The first four rows in Table 1 are the existing label distribution datasets; the last three rows in Table 1 are the datasets we created. Since some examples in the original label distribution datasets do not satisfy the prerequisites of our paper (i.e., there are some examples
RLSEP: Learning Label Ranks for Multi-label Classification
Dari, Emine, Yesilkaynak, V. Bugra, Mertan, Alican, Unal, Gozde
Multi-label ranking maps instances to a ranked set of predicted labels from multiple possible classes. The ranking approach to multi-label learning problems has received attention for its success in multi-label classification, with pairwise label ranking being one of the well-known approaches. However, most existing methods assume that only partial information about the preference relation is known, which is inferred from the partition of labels into a positive and a negative set, and then treat labels as equally important. In this paper, we focus on the unique challenge of ranking when the order of the true label set is provided. We propose a novel dedicated loss function that optimizes models by incorporating penalties for incorrectly ranked pairs and makes use of the ranking information present in the input. Our method achieves the best reported performance measures on both synthetic and real-world ranked datasets and shows improvements in the overall ranking of labels. Our experimental results demonstrate that our approach is generalizable to a variety of multi-label classification and ranking tasks, while revealing a calibration towards a certain ranking ordering.
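To make the idea of penalizing incorrectly ranked pairs concrete, here is a minimal sketch of a generic pairwise hinge loss that uses the order of the true labels. It is not the RLSEP loss from the paper; the function name `ranked_pair_hinge`, the margin, and the averaging over pairs are all illustrative assumptions.

```python
def ranked_pair_hinge(scores, ranked_positives, margin=1.0):
    """Generic pairwise hinge over an ordered true label set (illustrative only).

    scores: one float per label.
    ranked_positives: indices of the true labels, most significant first.
    """
    positives = set(ranked_positives)
    negatives = [j for j in range(len(scores)) if j not in positives]

    # Each pair is (should_score_higher, should_score_lower).
    pairs = []
    # An earlier true label should outscore every later true label.
    for a in range(len(ranked_positives)):
        for b in range(a + 1, len(ranked_positives)):
            pairs.append((ranked_positives[a], ranked_positives[b]))
    # Every true label should outscore every other label.
    for p in ranked_positives:
        for n in negatives:
            pairs.append((p, n))

    if not pairs:
        return 0.0
    # Hinge penalty whenever the required score gap falls short of the margin.
    total = sum(max(0.0, margin - (scores[hi] - scores[lo])) for hi, lo in pairs)
    return total / len(pairs)

scores = [3.0, 1.0, 2.0, -1.0]
print(ranked_pair_hinge(scores, [0, 2]))  # order agrees with the scores
print(ranked_pair_hinge(scores, [2, 0]))  # swapped order is penalized
```

A loss built only on the positive/negative bipartition would return the same value for both calls; the first loop over ordered true labels is what makes the two differ.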
- North America > United States > Vermont (0.04)
- Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Asia > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
Multi-label Ranking: Mining Multi-label and Label Ranking Data
Multi-label ranking (MLR) is the problem of predicting and ranking multiple labels for a single instance. The predicted labels are known as the instance's labelset. MLR can typically be reduced to two sub-problems. The first is multi-label classification, where the task is to bipartition the labels into relevant labels (the labelset) and irrelevant labels. The second is label ranking classification, where the task is to rank the labels for each instance. A label ranking may contain ties; in the extreme case, relevant labels tie for first place and irrelevant labels tie for second place, thus turning the label ranking classification into a multi-label one.
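The two sub-problems and the tie-based extreme case can be sketched as follows. All function names and the score threshold are hypothetical choices for illustration, assuming a model that outputs one real-valued score per label.

```python
def scores_to_labelset(scores, threshold=0.0):
    """Multi-label classification: indices of labels scored above the threshold."""
    return {i for i, s in enumerate(scores) if s > threshold}

def scores_to_ranking(scores):
    """Label ranking with ties: highest score gets rank 1, equal scores share a rank."""
    distinct = sorted(set(scores), reverse=True)
    rank_of = {s: i + 1 for i, s in enumerate(distinct)}
    return [rank_of[s] for s in scores]

def bipartition_to_ranking(relevant):
    """Extreme case: relevant labels tie for rank 1, irrelevant labels tie for rank 2."""
    return [1 if r else 2 for r in relevant]

scores = [0.9, -0.2, 0.4, 0.4]
print(scores_to_labelset(scores))                    # {0, 2, 3}
print(scores_to_ranking(scores))                     # [1, 3, 2, 2]
print(bipartition_to_ranking([True, False, True]))   # [1, 2, 1]
```

In the extreme case the two-level ranking carries exactly the same information as the labelset, which is why the ranking problem collapses into a multi-label one.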
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Research Report (1.00)
- Overview (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
- Health & Medicine > Nuclear Medicine (0.93)